4 research outputs found

    Smart Yoga Assistant: SVM-based Real-time Pose Detection and Correction System

    An SVM-based real-time pose detection and correction system is a computer system that uses machine learning techniques to detect and correct a person's yoga pose in real time. Such a system can act as a virtual yoga assistant, helping people improve their yoga practice by providing immediate feedback on their form and helping to prevent injury. This paper presents a yoga tracking and correction system that uses computer vision and machine learning algorithms to track and correct yoga poses. The system comprises a camera and a computer vision module that captures images of the yoga practitioner and identifies the poses being performed. The machine learning module analyzes the images to provide feedback on the quality of the poses and recommends corrections to improve form and prevent injuries. The paper proposes a customized support vector machine (SVM) based real-time pose detection and correction system that suggests yoga practices based on specific health conditions or diseases. It aims to provide a reliable and accessible resource for individuals seeking to use yoga as a complementary approach to managing their health conditions. The system also includes a practitioner's interface that enables practitioners to receive personalized recommendations for their yoga practice. The system is developed using Python and several open-source libraries, and was tested on a dataset of yoga poses. Tuning the hyperparameter gamma to optimize classification accuracy on our dataset yielded 87%, which is better than other approaches. The experimental results demonstrate the effectiveness of the system in tracking and correcting yoga poses, and its potential to enhance the quality of yoga practice.
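The abstract reports tuning the SVM hyperparameter gamma but does not say which kernel was used. As a minimal sketch, assuming the common RBF kernel, the snippet below shows how gamma controls how quickly similarity between two feature vectors decays with distance. The two-dimensional pose features and the candidate gamma values are hypothetical and are not taken from the paper.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel value exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical 2-D pose features (for example, two normalized joint angles).
pose_a = [0.0, 0.0]
pose_b = [1.0, 1.0]

# Larger gamma makes the kernel more local: the same pair of poses
# looks less similar, so the decision boundary can become more complex.
for gamma in (0.01, 0.1, 1.0, 10.0):
    print(f"gamma={gamma}: similarity={rbf_kernel(pose_a, pose_b, gamma):.6f}")
```

In practice gamma would typically be chosen by cross-validated search over a range like this one, keeping the value with the best held-out accuracy.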

    Personalization for BERT-based Discriminative Speech Recognition Rescoring

    Recognition of personalized content remains a challenge in end-to-end speech recognition. We explore three novel approaches that use personalized content in a neural rescoring step to improve recognition: gazetteers, prompting, and a cross-attention-based encoder-decoder model. We use internal de-identified en-US data from interactions with a virtual voice assistant, supplemented with personalized named entities, to compare these approaches. On a test set with personalized named entities, we show that each of these approaches improves word error rate by over 10% against a neural rescoring baseline. We also show that on this test set, natural language prompts can improve word error rate by 7% without any training and with a marginal loss in generalization. Overall, gazetteers were found to perform the best with a 10% improvement in word error rate (WER), while also improving WER on a general test set by 1%.
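The improvements above are stated as percentages of word error rate. The abstract does not say whether these are relative or absolute; the sketch below assumes relative reductions against the baseline. It computes WER as word-level edit distance divided by reference length. The function names and example utterances are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer, new_wer):
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical utterance with a personalized contact name.
reference = "call anna kowalski on mobile"
first_pass = "call anna koval ski on mobile"
rescored = "call anna kowalski on mobile"

print(word_error_rate(reference, first_pass))
print(word_error_rate(reference, rescored))
```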

    Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

    We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) like BERT have shown superior performance in second-pass rescoring, the high computational cost of scaling up the pretraining stage and adapting the pretrained models to specific domains limits their practical use in rescoring. Here we present a method based on low-rank decomposition to train a rescoring BERT model and adapt it to new domains using only a fraction (0.08%) of the pretrained parameters. These inserted matrices are optimized through a discriminative training objective along with a correlation-based regularization loss. The proposed low-rank adaptation Rescore-BERT (LoRB) architecture is evaluated on LibriSpeech and internal datasets with decreased training times by factors between 5.4 and 3.6.
    Comment: Accepted to IEEE ASRU 2023. Internal Review Approved. Revised 2nd version with Andreas and Huck. The first version was on Sep 29th. 8 pages
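The low-rank decomposition in the LoRB abstract above can be illustrated with a generic LoRA-style layer: a frozen weight matrix W is left untouched while two small inserted matrices A and B supply a trainable update, so the layer computes W x + B (A x). This is a minimal sketch of that general idea, not the authors' implementation. The dimensions, rank, and initialization are hypothetical, and the per-matrix fraction it prints is not the model-level 0.08% reported in the abstract.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, scale=1.0):
    """Return W x + scale * B (A x); W is frozen, only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def trainable_fraction(d_out, d_in, rank):
    """Parameters in A and B relative to the frozen d_out x d_in matrix."""
    return rank * (d_out + d_in) / (d_out * d_in)


random.seed(0)
d_out, d_in, rank = 6, 4, 2  # hypothetical sizes, chosen only for illustration
W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
B = [[0.0] * rank for _ in range(d_out)]  # zero init: adaptation starts as a no-op
x = [1.0, 2.0, 3.0, 4.0]

# With B initialized to zero, the adapted layer reproduces the frozen layer.
print(lora_forward(W, A, B, x) == matvec(W, x))
# Trainable share for one hypothetical 768 x 768 matrix at rank 8.
print(trainable_fraction(768, 768, 8))
```

Because only A and B receive gradient updates, the number of parameters to store per adapted domain grows with the rank rather than with the full matrix size.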